Rank | Count | Beginning |
---|---|---|
75 | 1579 | Ég |
16 | 1356 | Það |
60 | 997 | Í |
31 | 678 | En |
2 | 618 | Við |
118 | 513 | Hann |
114 | 509 | Þetta |
7 | 479 | Á |
103 | 386 | Þá |
77 | 305 | Svo |
519 | 300 | Og |
29 | 299 | Ef |
233 | 292 | Þar |
23 | 286 | Þegar |
85 | 272 | Hún |
193 | 255 | Að |
95 | 209 | Nú |
115 | 199 | Þeir |
164 | 192 | Með |
127 | 190 | Ekki |
12 | 186 | Er |
72 | 183 | Um |
214 | 171 | Einnig |
24 | 168 | Eftir |
94 | 168 | Mamma |
1 | 164 | Til |
125 | 161 | Mér |
50 | 147 | Þessi |
69 | 139 | Hér |
230 | 139 | Þú |
The next four subsections show the most frequent sentence beginnings consisting of N words, for N=1, 2, 3, 4. This subsection starts with N=1.
The most frequent word N-grams at the beginning of sentences give some insight into sentence composition.
For N=1 in particular, even a small corpus is enough to identify the most frequent sentence beginnings.
select substring_index(sentence, ' ', 1) as beg, count(*) as cnt from sentences group by substring_index(sentence, ' ', 1) order by cnt desc limit 50;
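
The same count can be sketched outside the database, which may help when checking the query on a small sample. The following is a minimal illustration in Python, not the code used to produce the table above. The function name `sentence_beginnings` and the sample sentences are hypothetical. It assumes sentences are plain strings and, like `substring_index(sentence, ' ', N)`, splits on single spaces and keeps the whole sentence when it has fewer than N words.

```python
from collections import Counter


def sentence_beginnings(sentences, n=1, limit=50):
    """Return the `limit` most frequent beginnings made of the first n words.

    Mirrors substring_index(sentence, ' ', n): split on single spaces and
    keep everything before the n-th space (or the whole sentence if it
    contains fewer than n spaces).
    """
    counts = Counter()
    for sentence in sentences:
        beginning = " ".join(sentence.split(" ")[:n])
        counts[beginning] += 1
    # most_common sorts by count, descending, like ORDER BY cnt DESC.
    return counts.most_common(limit)


# Hypothetical sample input, not taken from the corpus.
sample = ["Ég fer heim.", "Ég er hér.", "Það er gott.", "Ég"]

for beginning, count in sentence_beginnings(sample, n=1):
    print(count, beginning)
```

Passing `n=2`, `n=3`, or `n=4` gives the longer beginnings covered in the following subsections, under the same splitting assumption.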
4.3.1.2 Most Frequent Sentence Beginnings II
4.3.1.3 Most Frequent Sentence Beginnings III
4.3.1.4 Most Frequent Sentence Beginnings IV
4.3.1.1 Most Frequent Sentence Endings I
4.3.1.2 Most Frequent Sentence Endings II
4.3.1.3 Most Frequent Sentence Endings III
4.3.1.4 Most Frequent Sentence Endings IV